Human Action Recognition Using Pyramid Vocabulary Tree
Authors
Abstract
Bag-of-visual-words (BOVW) approaches are widely used in human action recognition. Usually, a large BOVW vocabulary is more discriminative for inter-class action classification, while a small one is more robust to noise and thus more tolerant of intra-class variance. In this paper, we propose a pyramid vocabulary tree to model local spatio-temporal features, which can characterize inter-class differences while also allowing for intra-class variance. Moreover, since BOVW is geometrically unconstrained, we further consider the spatio-temporal information of local features and propose a sparse spatio-temporal pyramid matching kernel (termed SST-PMK) to compute similarity measures between video sequences. SST-PMK satisfies Mercer's condition and can therefore be readily integrated into an SVM to perform action recognition. Experimental results on the Weizmann datasets show that both the pyramid vocabulary tree and SST-PMK lead to a significant improvement in human action recognition.
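To make the idea of matching across vocabulary resolutions concrete, the following is a minimal sketch of a generic pyramid match score over visual-word histograms. It is not the SST-PMK described in the abstract: it ignores spatio-temporal layout and sparsity, and it assumes a hypothetical binary vocabulary tree in which each coarser word merges two adjacent finer words. The function names (`coarsen`, `build_pyramid`, `pyramid_match`) and the 1/2^level weighting are illustrative assumptions.

```python
def histogram_intersection(h1, h2):
    """Number of features matched between two histograms over the same vocabulary."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def coarsen(hist):
    """Merge adjacent bin pairs (assumed binary vocabulary tree, hypothetical)."""
    if len(hist) % 2 != 0:
        raise ValueError("histogram length must be even to merge bin pairs")
    return [hist[i] + hist[i + 1] for i in range(0, len(hist), 2)]


def build_pyramid(hist, num_levels):
    """Return histograms ordered from the finest (level 0) to the coarsest vocabulary."""
    levels = [list(hist)]
    for _ in range(num_levels - 1):
        levels.append(coarsen(levels[-1]))
    return levels


def pyramid_match(levels_x, levels_y):
    """Weighted count of matches that first appear at each vocabulary level.

    Matches found with the fine (large) vocabulary get full weight; matches
    that only appear after merging words are discounted by 1 / 2**level.
    """
    score = 0.0
    previous = 0
    for level, (hx, hy) in enumerate(zip(levels_x, levels_y)):
        matched = histogram_intersection(hx, hy)
        score += (matched - previous) / (2 ** level)
        previous = matched
    return score


# Toy visual-word counts for two video sequences (made-up numbers).
video_a = build_pyramid([2, 0, 1, 1], num_levels=3)
video_b = build_pyramid([1, 1, 0, 2], num_levels=3)
similarity = pyramid_match(video_a, video_b)
```

In this sketch the fine vocabulary rewards exact word agreement, while the coarser levels still credit features that fall into neighbouring words, which mirrors the trade-off between discriminative power and tolerance to intra-class variance that the abstract describes.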
Similar resources
Adaptive Vocabulary Forests for Dynamic Indexing and Category Learning
Histogram pyramid representations computed from a vocabulary tree of visual words have proven valuable for a range of image indexing and recognition tasks; however, they have only used a single, fixed partition of feature space. We present a new efficient algorithm to incrementally compute set-of-trees (forest) vocabulary representations, and show they improve recognition and indexing performan...
Selective spatio-temporal interest points
Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper, we present a novel approach for robust and selective STIP detection, by applying surround suppression combined with local and temporal constraints. This new method is significantly different from existing STI...
CPPP/UFMS at ImageCLEF 2014: Robot Vision Task
This paper describes the participation of the CPPP/UFMS group in the robot vision task. We have applied the spatial pyramid matching proposed by Lazebnik et al. This method extends bag-of-visual-words to spatial pyramids by concatenating histograms of local features found in increasingly fine sub-regions. To form the visual vocabulary, k-means clustering was applied to a random subset of images f...
Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers
This paper parallelizes the spatial pyramid match kernel (SPK) implementation. Used together with a support vector machine classifier, SPK is one of the most widely used kernel methods and achieves high accuracy in object recognition. The MATLAB Parallel Computing Toolbox was used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...
Cognitive Image Representation Based on Spectrum Pyramid Decomposition
Contemporary image representation is based on various techniques, using matrices, vectors, multi-resolution pyramids, R-trees, orthogonal transforms, anisotropic perceptual representations, etc. This paper offers a new approach to cognitive image representation based on adaptive spectrum pyramid decomposition controlled by neural networks. This approach corresponds to the hypothesi...